defense performance
Supplementary Materials of Random Noise Defense against Query-Based Black-Box Attacks
In this supplementary document, we provide additional materials to supplement our main submission. In Section A, we discuss the societal impacts of our work. In Section B, we provide detailed experimental settings as well as further evaluation results on CIFAR-10 and ImageNet. We also provide a comparison with input-transformation-based defense methods. In Section D, we give the proofs w.r.t. In Section E, we give the proofs w.r.t. The proofs of Theorem 3 are given in Section F. In Section C, we provide the analysis and evaluation of decision-based attacks.
Deep neural networks (DNNs) have been successfully applied in many safety-critical tasks, such as autonomous driving and face recognition and verification. However, adversarial samples pose a serious threat to machine learning systems.
Random Noise Defense Against Query-Based Black-Box Attacks
Query-based black-box attacks pose serious threats to machine learning models in many real applications. In this work, we study a lightweight defense method, dubbed Random Noise Defense (RND), which adds proper Gaussian noise to each query. We theoretically analyze the effectiveness of RND against query-based black-box attacks and the corresponding adaptive attacks. Our theoretical results reveal that the defense performance of RND is determined by the magnitude ratio between the noise induced by RND and the noise added by the attackers for gradient estimation or local search. A larger magnitude ratio leads to stronger defense performance of RND, and it is also critical for mitigating adaptive attacks. Based on our analysis, we further propose to combine RND with a plausible Gaussian augmentation Fine-tuning (RND-GF). This enables RND to add larger noise to each query while maintaining the clean accuracy, yielding a better trade-off between clean accuracy and defense performance. Additionally, RND can be flexibly combined with existing defense methods, such as adversarial training (AT), to further boost adversarial robustness. Extensive experiments on CIFAR-10 and ImageNet verify our theoretical findings and the effectiveness of RND and RND-GF.
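The role of the magnitude ratio can be illustrated with a minimal sketch. It is not the paper's implementation: the names `nu` (defender noise scale), `mu` (attacker step size), the toy linear model, and the two-query finite-difference attacker are all assumptions made for illustration. For a linear score, the error of the attacker's slope estimate scales with `nu / mu`.

```python
import random
import statistics


def rnd_query(model, x, nu, rng):
    """Answer a query on x after adding N(0, nu^2) noise to every coordinate."""
    return model([xi + nu * rng.gauss(0.0, 1.0) for xi in x])


def fd_estimate(model, x, u, mu, nu, rng):
    """Attacker's finite-difference slope along u, seen through the defense."""
    x_step = [xi + mu * ui for xi, ui in zip(x, u)]
    f_step = rnd_query(model, x_step, nu, rng)
    f_base = rnd_query(model, x, nu, rng)
    return (f_step - f_base) / mu


def estimation_error(nu, mu, trials=2000, dim=16, seed=0):
    """Standard deviation of the slope-estimate error over repeated attempts."""
    rng = random.Random(seed)
    model = lambda z: sum(z)      # hypothetical linear score, all-ones weights
    x = [0.0] * dim
    u = [dim ** -0.5] * dim       # unit-norm probing direction
    true_slope = sum(u)           # exact directional derivative of the toy model
    errors = [
        fd_estimate(model, x, u, mu, nu, rng) - true_slope
        for _ in range(trials)
    ]
    return statistics.pstdev(errors)


if __name__ == "__main__":
    for nu in (0.0, 0.001, 0.01):
        print(nu / 0.01, estimation_error(nu=nu, mu=0.01))
```

In this toy setting the attacker recovers the slope exactly when `nu = 0`, and the estimate degrades in proportion to `nu / mu`. Real classifiers are nonlinear, so this only illustrates the qualitative dependence on the ratio.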
proofs
A.1 Proof of Theorem 1
Before proving Theorem 1, we first demonstrate the superiority of even-hop neighbors over odd-hop neighbors from the perspective of random walks. In a binary node classification task, denote by $p_k$, $k > 0$, the probability that a random walk of length $k$ starts and ends at nodes with the same label. Suppose the edge homophily level $h$ is a random variable that follows a uniform distribution on $[0,1]$ and $p_1 = h$. Then:
Lemma 1. If $k$ is odd, $\mathbb{E}_h[p_k] = \frac{1}{2}$. If $k$ is even, $\mathbb{E}_h[p_k] > \frac{1}{2}$.
Proof.
We now provide a brief discussion of the superiority of even-hop neighbors in multi-class node classification tasks following [14].
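Lemma 1 can be checked numerically with a short sketch. It rests on an assumption not stated in the text above: each step of the walk independently keeps the label with probability $h$, so that $p_j = h\,p_{j-1} + (1-h)(1-p_{j-1})$ with $p_0 = 1$ (which gives $p_1 = h$). The function names and the midpoint-rule integration over $h$ are illustrative choices.

```python
def p_same_label(h, k):
    """Probability that a k-step walk ends on the start label (binary case),
    assuming each step keeps the label independently with probability h."""
    p = 1.0  # a walk of length 0 trivially ends on its own label
    for _ in range(k):
        p = h * p + (1.0 - h) * (1.0 - p)
    return p


def expected_p(k, grid=10000):
    """Midpoint-rule estimate of E_h[p_k] for h ~ Uniform[0, 1]."""
    total = 0.0
    for i in range(grid):
        h = (i + 0.5) / grid
        total += p_same_label(h, k)
    return total / grid


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, expected_p(k))
```

Under this assumption the recursion has the closed form $p_k = \frac{1}{2}\left(1 + (2h-1)^k\right)$, whose expectation over $h$ equals $\frac{1}{2}$ for odd $k$ and $\frac{1}{2} + \frac{1}{2(k+1)}$ for even $k$, in line with the lemma.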